A Rough Set Approach to Data with Missing Attribute Values

Author

  • Jerzy W. Grzymala-Busse
Abstract

In this paper we discuss four kinds of missing attribute values: lost values (values that were recorded but are currently unavailable), "do not care" conditions (the original values were irrelevant), restricted "do not care" conditions (similar to ordinary "do not care" conditions but interpreted differently; these missing attribute values may occur when the same data set contains both lost values and "do not care" conditions), and attribute-concept values (these missing attribute values may be replaced by any attribute value limited to the same concept). Throughout the paper the same calculus, based on computations of blocks of attribute-value pairs, is used. Incomplete data are characterized by characteristic relations, which in general are neither symmetric nor transitive. Lower and upper approximations are generalized for data with missing attribute values. Finally, some experiments on different interpretations of missing attribute values and different approximation definitions are cited.
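
The calculus mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: it assumes the formulation commonly used in the rough set literature on incomplete data, in which a case with a lost value is left out of every block of that attribute, a case with a "do not care" condition is placed in every block of that attribute, and the characteristic set of a case is the intersection of the blocks of its known attribute values. The symbols `?` and `*`, the toy table, and all function names are assumptions made for illustration; restricted "do not care" conditions and attribute-concept values are not covered, and only one generalization of the approximations (the concept approximations) is shown.

```python
from typing import Dict, Hashable, List, Set, Tuple

LOST = "?"        # assumed marker for a lost value
DONT_CARE = "*"   # assumed marker for a "do not care" condition

Table = Dict[int, Dict[str, str]]
Blocks = Dict[Tuple[str, Hashable], Set[int]]

# Hypothetical toy data: six cases described by two attributes.
table: Table = {
    1: {"Temperature": "high",      "Headache": "?"},
    2: {"Temperature": "very_high", "Headache": "yes"},
    3: {"Temperature": "?",         "Headache": "no"},
    4: {"Temperature": "high",      "Headache": "yes"},
    5: {"Temperature": "*",         "Headache": "yes"},
    6: {"Temperature": "normal",    "Headache": "no"},
}
attributes: List[str] = ["Temperature", "Headache"]


def blocks(table: Table, attributes: List[str]) -> Blocks:
    """Blocks of attribute-value pairs for every known value of each attribute."""
    result: Blocks = {}
    for a in attributes:
        known = {row[a] for row in table.values()} - {LOST, DONT_CARE}
        for v in known:
            # Lost values join no block; "do not care" cases join every block.
            result[(a, v)] = {
                x for x, row in table.items() if row[a] in (v, DONT_CARE)
            }
    return result


def characteristic_set(x: int, table: Table, attributes: List[str],
                       blk: Blocks) -> Set[int]:
    """Intersect the blocks of the known values of x; missing values impose no constraint."""
    k = set(table)  # start from the whole universe
    for a in attributes:
        v = table[x][a]
        if v not in (LOST, DONT_CARE):
            k &= blk[(a, v)]
    return k


def concept_approximations(concept: Set[int], table: Table,
                           attributes: List[str]) -> Tuple[Set[int], Set[int]]:
    """Concept lower and upper approximations built from characteristic sets."""
    blk = blocks(table, attributes)
    lower: Set[int] = set()
    upper: Set[int] = set()
    for x in concept:
        k = characteristic_set(x, table, attributes, blk)
        if k <= concept:
            lower |= k
        upper |= k  # x always belongs to its own characteristic set
    return lower, upper


blk = blocks(table, attributes)
K = {x: characteristic_set(x, table, attributes, blk) for x in table}
lower, upper = concept_approximations({1, 2, 4}, table, attributes)
print(sorted(lower), sorted(upper))  # [] [1, 2, 4, 5]
```

Reading the characteristic relation as "y is related to x when y is in the characteristic set of x", the toy table shows why such a relation need not be symmetric: case 4 is in the characteristic set of case 1, but case 1 is not in the characteristic set of case 4.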

Similar resources

Mining Incomplete Data with Many Missing Attribute Values: A Comparison of Probabilistic and Rough Set Approaches

In this paper, we study probabilistic and rough set approaches to missing attribute values. Probabilistic approaches are based on imputation: a missing attribute value is replaced either by the most probable known attribute value or by the most probable attribute value restricted to a concept. In this paper, in a rough set approach to missing attribute values we consider two interpretations of ...

A comparison of traditional and rough set approaches to missing attribute values in data mining

Real-life data sets are often incomplete, i.e., some attribute values are missing. In this paper we compare traditional, frequently used methods of handling missing attribute values, which are based on preprocessing, with another class of methods dealing with missing attribute values in which rule induction is performed directly on incomplete data sets, i.e., handling missing attribute values a...

A Comparative Study on Decision Rule Induction for incomplete data using Rough Set and Random Tree Approaches

Handling missing attribute values is the most challenging process in data analysis. Many approaches can be adopted to handle missing attributes. In this paper, a comparative analysis is made of an incomplete dataset for future prediction using the rough set approach and random tree generation in data mining. The result of simple classification technique (using random tree ...

Comparisons on Different Approaches to Assign Missing Attribute Values

A commonly used and naive solution for processing data with missing attribute values is to ignore the instances that contain missing attribute values. This method may neglect important information within the data, a significant amount of data could easily be discarded, and the discovered knowledge may not contain significant rules. Some methods, such as assigning the most common values or assigning ...

Rough set approach to incomplete numerical data

Rough set theory has been successfully applied in various sectors, but the classical rough set model can only handle complete data and data sets in symbolic form (Jianhua Dai, 2013). Research adopting rough set theory has been conducted on numerical attributes and lost attribute values (Jerzy W. Grzymala-Busse and Zdzislaw S. Hippe, Nov. 2011). In this research discuss...

Three Approaches to Missing Attribute Values: A Rough Set Perspective

A new approach to missing attribute values, based on the idea of an attribute-concept value, is studied in the paper. This approach, together with two other approaches to missing attribute values, based on "do not care" conditions and lost values, is discussed using rough set methodology, including attribute-value pair blocks, characteristic sets, and characteristic relations. Characteristic se...

Publication date: 2006